ECON 707/807: Econometrics II

Course Introduction

Evie Zhang

Old Dominion University

Time Series

Sequential measurements or values of a single entity over time.

Why study time series data?

  • Gross Domestic Product
  • Unemployment
  • Vehicle Demand
  • Energy Consumption

Forecasting

“Predict” outcomes of a time series in future (unobserved) periods.

Time Series Models

\[Y_{t} = f(Y_{t-1}, X_{t}, X_{t-1}, \tau_t, C_t, S_t)\] where:

  • \(Y_{t-1}\) is a lagged value of \(Y\)
  • \(X_t\) is a contemporaneous value of an independent variable \(X\)
  • \(X_{t-1}\) is a lagged value of \(X\)
  • \(\tau_t\) is the trend
  • \(C_t\) is the cycle
  • \(S_t\) is the season
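
To make the general form concrete, here is a minimal sketch that simulates a monthly series and fits a linear version of \(f(\cdot)\) with `lm()`. Everything in it is an assumption for illustration: the coefficients, the sine-shaped seasonal pattern, the seed, and the variable names are all hypothetical, and the cycle \(C_t\) is left out to keep the example short.

```r
set.seed(707)  # arbitrary seed so the sketch is reproducible

n       <- 120                                 # ten years of monthly data
t_index <- seq_len(n)                          # linear trend term (tau_t)
month   <- factor(rep(1:12, length.out = n))   # seasonal indicator (S_t)
x       <- rnorm(n)                            # hypothetical regressor X_t

# Simulate Y_t = 0.5*Y_{t-1} + 0.8*X_t + 0.02*t + seasonal effect + noise
season_effect <- sin(2 * pi * (1:12) / 12)
y <- numeric(n)
for (t in 2:n) {
  y[t] <- 0.5 * y[t - 1] + 0.8 * x[t] + 0.02 * t_index[t] +
    season_effect[as.integer(month[t])] + rnorm(1, sd = 0.5)
}

# Build the lagged columns by shifting each vector down one period
sim <- data.frame(
  y      = y,
  y_lag1 = c(NA, y[-n]),
  x      = x,
  x_lag1 = c(NA, x[-n]),
  trend  = t_index,
  month  = month
)

# Linear version of f(): lagged Y, current and lagged X, trend, month dummies
fit <- lm(y ~ y_lag1 + x + x_lag1 + trend + month, data = sim)
round(coef(fit)[c("y_lag1", "x", "x_lag1", "trend")], 2)
```

The first observation is dropped automatically because its lags are `NA`, so the model is fit on 119 rows.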

Statistics Review

  • mean
  • variance
  • standard error
  • central limit theorem (CLT)
  • p-value
  • \(Y_i = \alpha + \beta X_i + \epsilon_{i}\)
  • causality
  • spurious
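
As a quick refresher, the sketch below computes several of these quantities on simulated data and then regresses one random walk on another, independent one. The data, seed, and object names are made up for illustration. Regressions of this second kind often show small p-values even though the two series are unrelated by construction, which is one common meaning of "spurious" in time series.

```r
set.seed(807)  # arbitrary seed so the sketch is reproducible

n <- 200
x <- rnorm(n, mean = 5, sd = 2)
y <- 1 + 0.5 * x + rnorm(n)          # true alpha = 1, true beta = 0.5

x_bar <- mean(x)                     # sample mean
x_var <- var(x)                      # sample variance (divides by n - 1)
x_se  <- sd(x) / sqrt(n)             # standard error of the sample mean

# OLS: Y_i = alpha + beta * X_i + epsilon_i
ols <- lm(y ~ x)
ols_table <- summary(ols)$coefficients   # estimate, std. error, t value, p-value
ols_table

# Two independent random walks: no true relationship between them
rw1 <- cumsum(rnorm(n))
rw2 <- cumsum(rnorm(n))
spurious <- lm(rw1 ~ rw2)
spurious_p <- summary(spurious)$coefficients["rw2", "Pr(>|t|)"]
spurious_p
```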

Forecasting Steps

  1. Define Problem

    What am I trying to solve?

  2. Gather Data

    FRED, WRDS, etc.

  3. Exploratory Data Analysis (EDA)

    Plot, Plot, Plot

  4. Choose & Fit a Model

    \(Y_{t} = f(Y_{t-1}, X_{t}, X_{t-1}, \tau_t, C_t, S_t)\)

  5. Evaluate, Forecast
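
The five steps above can be sketched end to end with base R. This is an illustrative walk-through, not the course's prescribed workflow: it assumes the built-in `AirPassengers` dataset as a stand-in for downloaded data, a simple trend-plus-month-dummies model, and a two-year holdout for evaluation.

```r
# 1-2. Problem and data: forecast monthly airline passengers
y <- log(AirPassengers)   # built-in monthly ts object, 1949-01 to 1960-12

# 3. EDA: plot the series before modeling
plot(y, main = "Log monthly airline passengers", ylab = "log(passengers)")

# 4. Choose and fit a model on a training window (through 1958)
train <- window(y, end = c(1958, 12))
test  <- window(y, start = c(1959, 1))

train_df <- data.frame(
  y     = as.numeric(train),
  trend = seq_along(train),        # linear trend
  month = factor(cycle(train))     # month-of-year dummies
)
fit <- lm(y ~ trend + month, data = train_df)

# 5. Evaluate: forecast the holdout period and compute RMSE
test_df <- data.frame(
  trend = length(train) + seq_along(test),
  month = factor(cycle(test))
)
pred <- predict(fit, newdata = test_df)
rmse <- sqrt(mean((as.numeric(test) - pred)^2))
rmse
```

Holding out the last two years mimics forecasting "unobserved" periods: the model never sees the data it is scored on.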

Packages and Functions

  • dplyr
  • tidyverse
  • ggplot2
  • group_by(), summarise(), left_join()
  • paste(), paste0(), substr(), grepl(), gsub(), regexpr(), strsplit(), unlist()
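
A short sketch of the base R string helpers listed above. The strings are hypothetical examples chosen for illustration.

```r
series_id <- paste0("UNRATE", "_", "VA")         # "UNRATE_VA" (no separator added)
label     <- paste("Unemployment", "Rate")       # "Unemployment Rate" (space separator)

yr      <- substr("1948-01-01", 1, 4)            # "1948": characters 1 through 4
is_va   <- grepl("VA", c("UNRATE_VA", "UNRATE_NY"))   # TRUE FALSE
cleaned <- gsub("_", " ", series_id)             # "UNRATE VA": replace every match
pos     <- as.integer(regexpr("_", series_id))   # 7: position of the first match
parts   <- unlist(strsplit("1948-01-01", "-"))   # "1948" "01" "01"
```

`strsplit()` returns a list (one element per input string), which is why `unlist()` is applied to get a plain character vector.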

Packages and Functions

  • for()
  • if(), else
  • ifelse()
  • fixest
  • scales::alpha()
  • data.table::fread()
  • readRDS(), saveRDS()
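
A minimal sketch contrasting a `for()` loop with `if`/`else` against the vectorized `ifelse()`, followed by a `saveRDS()`/`readRDS()` round trip. The values and the 3.8 cutoff are arbitrary and only for illustration.

```r
rates <- c(3.4, 3.8, 4.0, 3.9, 3.5, 3.6)   # illustrative values

# Loop version: classify one element at a time
flag_loop <- character(length(rates))
for (i in seq_along(rates)) {
  if (rates[i] >= 3.8) {
    flag_loop[i] <- "high"
  } else {
    flag_loop[i] <- "low"
  }
}

# Vectorized version: same result in one call
flag_vec <- ifelse(rates >= 3.8, "high", "low")
identical(flag_loop, flag_vec)   # TRUE

# Save an R object to disk and read it back unchanged
tmp <- tempfile(fileext = ".rds")
saveRDS(flag_vec, tmp)
flag_back <- readRDS(tmp)
```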

Packages and Functions

  • lubridate
  • tsibble
  • forecast
  • ts()
  • duplicated()
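
A brief sketch of two base R functions from this list: `ts()` attaches a start date and frequency to a numeric vector, and `duplicated()` flags repeated entries, which helps check that each period appears only once. The values and dates are hypothetical.

```r
rates <- c(3.4, 3.8, 4.0, 3.9, 3.5, 3.6)   # illustrative monthly values

# Monthly series starting January 1948
urate_ts <- ts(rates, start = c(1948, 1), frequency = 12)
start(urate_ts)       # 1948 1
frequency(urate_ts)   # 12

# Flag the second (and later) occurrence of each date
dates <- as.Date(c("1948-01-01", "1948-02-01", "1948-02-01", "1948-03-01"))
dup <- duplicated(dates)   # FALSE FALSE TRUE FALSE
any(dup)                   # TRUE: February appears twice
```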

Install and Load Packages

Code
install.packages("lubridate")  # one-time install; skip if already installed
library("lubridate")           # load the package in each new R session

Make a Time Series

Code
urate <- read.csv("../data/unrate_us.csv")
colnames(urate) <- c("date", "urate_t")
head(urate)
        date urate_t
1 1948-01-01     3.4
2 1948-02-01     3.8
3 1948-03-01     4.0
4 1948-04-01     3.9
5 1948-05-01     3.5
6 1948-06-01     3.6

Plot a Time Series

Code
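library(tidyverse)  # provides %>%, mutate(), and ggplot()

# Parse the date strings into Date objects (ymd() is from lubridate)
urate <- urate %>%
  mutate(date = ymd(date))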

urate %>%
  ggplot(aes(x = date, y = urate_t)) +
  geom_point() +
  labs(
    title = "Monthly Unemployment Rate",
    x = "Month",
    y = "Unemployment Rate"
  ) +
  theme_minimal() +  # Simpler background theme with x and y axis
  theme(
    plot.title = element_text(hjust = 0.5),  # Centering title
    axis.text.x = element_text(angle = 0, hjust = 1, color = "black"),
    axis.text.y = element_text(color = "black")
  )

Plot a Time Series

Code
ggplot(urate, aes(x = date, y = urate_t)) +
  geom_line(color = "dodgerblue") +
  labs(
    title = "Monthly Unemployment Rate",
    x = "Time",
    y = "Unemployment Rate"
  ) +
  theme_minimal() +  # Simpler background theme with x and y axis
  theme(
    plot.title = element_text(hjust = 0.5),  # Centering title
    axis.text.x = element_text(angle = 0, hjust = 1, color = "black"),
    axis.text.y = element_text(color = "black")
  )

Lags (Leads)

        date urate_t
1 1948-01-01     3.4
2 1948-02-01     3.8
3 1948-03-01     4.0
4 1948-04-01     3.9
5 1948-05-01     3.5
6 1948-06-01     3.6
Code
library(tidyverse)

urate <- urate %>%
  mutate(
    urate_tm1 = lag(urate_t, order_by = date),
    urate_tp1 = lead(urate_t, order_by = date)
  )

head(urate)
        date urate_t urate_tm1 urate_tp1
1 1948-01-01     3.4        NA       3.8
2 1948-02-01     3.8       3.4       4.0
3 1948-03-01     4.0       3.8       3.9
4 1948-04-01     3.9       4.0       3.5
5 1948-05-01     3.5       3.9       3.6
6 1948-06-01     3.6       3.5       3.6
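
The same `lag()` call extends to longer lags and to period-over-period changes. The sketch below hard-codes the six observations printed above so it runs on its own; the column names `d_urate` and `urate_tm2` are hypothetical.

```r
library(dplyr)

# First six observations, copied from the output above
urate_small <- tibble(
  date = as.Date(c("1948-01-01", "1948-02-01", "1948-03-01",
                   "1948-04-01", "1948-05-01", "1948-06-01")),
  urate_t = c(3.4, 3.8, 4.0, 3.9, 3.5, 3.6)
)

urate_small <- urate_small %>%
  mutate(
    d_urate   = urate_t - lag(urate_t, order_by = date),   # change from previous month
    urate_tm2 = lag(urate_t, n = 2, order_by = date)       # value two months earlier
  )

urate_small
```

Each additional lag costs one observation at the start of the sample: `d_urate` is `NA` in the first row, and `urate_tm2` is `NA` in the first two.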

What about panels?

Code
library(tidyverse)

df <- tibble(
  state = c(rep("NY", 10),
            rep("VA", 10),
            rep("CA", 10)),
  year = rep(2010:2019, 3),
  var = rnorm(30)
)

df %>%
  filter(year %in% 2010:2012)
# A tibble: 9 × 3
  state  year      var
  <chr> <int>    <dbl>
1 NY     2010 -0.456  
2 NY     2011  0.935  
3 NY     2012 -0.00928
4 VA     2010  0.0632 
5 VA     2011  1.61   
6 VA     2012  0.598  
7 CA     2010 -0.924  
8 CA     2011 -0.581  
9 CA     2012  1.76   

What about panels?

Code
df <- df %>%
  group_by(state) %>%
  mutate(var_lag1 = lag(var, n = 1, default = NA, order_by = year)) %>%  # lag within each state, ordered by year
  ungroup()

df %>%
  filter(year %in% 2010:2012)
# A tibble: 9 × 4
  state  year      var var_lag1
  <chr> <int>    <dbl>    <dbl>
1 NY     2010 -0.456    NA     
2 NY     2011  0.935    -0.456 
3 NY     2012 -0.00928   0.935 
4 VA     2010  0.0632   NA     
5 VA     2011  1.61      0.0632
6 VA     2012  0.598     1.61  
7 CA     2010 -0.924    NA     
8 CA     2011 -0.581    -0.924 
9 CA     2012  1.76     -0.581 
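
To see why `group_by()` matters, the sketch below compares a lag taken on the stacked data with one taken within each state. The two-state panel and its values are invented so the leak is easy to spot.

```r
library(dplyr)

# Small hypothetical panel: two states, three years each
panel <- tibble(
  state = rep(c("NY", "VA"), each = 3),
  year  = rep(2010:2012, times = 2),
  var   = c(1, 2, 3, 10, 20, 30)
)

panel <- panel %>%
  mutate(lag_wrong = lag(var)) %>%                  # ignores state boundaries
  group_by(state) %>%
  mutate(lag_right = lag(var, order_by = year)) %>% # lag within each state
  ungroup()

panel
```

In the ungrouped version, VA's 2010 row picks up NY's 2012 value (3) as its "lag". With grouping, that row is correctly `NA`.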

Multiple Plots

Code
ggplot(df, aes(x = as.factor(year), y = var)) +
  geom_point() +
  labs(
    title = "",
    x = "Year",
    y = "Var Value"  # Modified y-axis label
  ) +
  theme_minimal() +  # Simpler background theme with x and y axis
  theme(
    plot.title = element_text(hjust = 0.5),  # Centering title
    axis.text.x = element_text(angle = 0, hjust = 1, color = "black"),
    axis.text.y = element_text(color = "black")
  )

Multiple Plots

Code
library(ggplot2)

ggplot(df, aes(x = as.factor(year), y = var, group = state, color = state)) +
  geom_point() +
  scale_color_brewer(name = "State", palette = "Set1") + # Different set of colors
  labs(
    title = "",
    x = "Year",
    y = "Var Value"  # Modified y-axis label
  ) +
  theme_minimal() +  # Simpler background theme with x and y axis
  theme(
    plot.title = element_text(hjust = 0.5),  # Centering title
    axis.text.x = element_text(angle = 0, hjust = 1, color = "black"),
    axis.text.y = element_text(color = "black")
  )

Multiple Plots

Code
ggplot(df, aes(x = as.factor(year), y = var, group = state, color = state)) +
  geom_line() +
  scale_color_manual(name = "State", values = c("NY" = "#FF9999", "VA" = "#99CCFF", "CA" = "#99FF99")) +  # Pastel color scheme
  labs(
    title = "",
    x = "Year",
    y = "Var"
  ) +
  theme_minimal() +  # Simpler background theme with x and y axis
  theme(
    plot.title = element_text(hjust = 0.5),  # Centering title
    axis.text.x = element_text(angle = 0, hjust = 1, color = "black"),
    axis.text.y = element_text(color = "black")
  )

Multiple Plots

Code
df %>%
  filter(state == "NY") %>%
  ggplot(aes(x = as.factor(year), y = var, color = state)) +
  geom_line(group = 1) +
  scale_color_manual(values = c("NY" = "#FF9999")) + # Soft pink color for NY
  labs(
    title = "",
    x = "Year",
    y = "Var"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5),  # Centering title
    axis.text.x = element_text(angle = 0, hjust = 1, color = "black"),
    axis.text.y = element_text(color = "black")
  )

Multiple Plots

Code
df %>%
  ggplot(aes(x = as.factor(year), y = var, color = state, group = state)) +
  geom_line() +
  scale_color_brewer(palette = "Pastel1") + # Using the "Pastel1" palette for milder colors
  labs(
    title = "",
    x = "Year",
    y = "Var"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5),  # Centering title
    axis.text.x = element_text(angle = 0, hjust = 1, color = "black"),
    axis.text.y = element_text(color = "black")
  )

Multiple Plots

Code
df %>%
  ggplot(aes(x = as.factor(year), y = var, color = state, group = state)) +
  geom_line() +
  scale_color_brewer(palette = "Pastel1") + # Using the "Pastel1" palette for milder colors
  labs(
    title = "",
    x = "Year",
    y = "Var"
  ) +
  facet_wrap(~ state, scales = "free_y") + # Separate graphs based on state
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5),  # Centering title
    axis.text.x = element_text(angle = 45, hjust = 1, color = "black"),
    axis.text.y = element_text(color = "black")
  )

Next Class

  1. \(E[y]\), \(\hat{y}\)
  2. Loss Functions
  3. Lags, Leads
  4. Conditional Forecasts
  5. Time Series Components

Practice - Cleaning Data

  1. Work on #5.

  2. Work on #8.

Practice - Social Security

  1. Navigate to the following link: here
  2. Download the state-specific data.
  3. Read in the data from Virginia.
    • Make some plots for a name of your choosing (time series, distributions, etc.)
  4. Read in all of the data
    • Combine into one data.frame.
    • In what state is your chosen name “most” popular?
    • In what year is your chosen name “most” popular?
    • Analyze the “gender neutrality” of a relatively unisex name (e.g. “Alex”).
    • Analyze a common name and its nickname (e.g. “Alex” vs “Alexander”).